Report: Performance comparison between C2075 and P100 GPU cards using cosmological correlation functions

Authors

  • Miguel Cárdenas Montes
  • Iván Méndez-Jiménez
  • Juan José Rodríguez-Vázquez
  • José María Hernández Calama
Abstract

In this report, cosmological correlation functions are used to evaluate the performance difference between the C2075 and P100 GPU cards. The correlation functions used in this work have previously been studied and exploited extensively on earlier GPU architectures. The performance analysis indicates that the P100 card achieves a speedup of 13x to 15x without any additional optimization.

Similar resources

Efficient Computation of the Kleene Star in Max-Plus Algebra using a CUDA GPU

This research aims to accelerate the computation of the Kleene star in max-plus algebra using CUDA technology on graphics processing units (GPUs). The target module is the Kleene star of a weighted adjacency matrix for directed acyclic graphs (DAGs), which plays an essential role in calculating the earliest and/or latest schedule for a class of discrete event systems. In recent NVIDIA GPU cards, ...

GPU-based simulation of brain neuron models

Faculty of Electrical Engineering, Mathematics and Computer Science, CE-MS-2013-10. The human brain is an incredible system that can process, store, and transfer information at high speed and volume. Inspired by such a system, engineers and scientists are cooperating to construct a digital brain with these characteristics. The brain is composed of billions of neurons which can be modeled by math...

Finite Element Matrix Generation on a GPU

This paper presents an efficient technique for fast generation of sparse systems of linear equations arising in computational electromagnetics in a finite element method using higher order elements. The proposed approach employs a graphics processing unit (GPU) for both numerical integration and matrix assembly. The performance results obtained on a test platform consisting of a Fermi GPU (1x T...

Porting of the DBCSR Library for Sparse Matrix-Matrix Multiplications to Intel Xeon Phi Systems

Multiplication of two sparse matrices is a key operation in the simulation of the electronic structure of systems containing thousands of atoms and electrons. The highly optimized sparse linear algebra library DBCSR (Distributed Block Compressed Sparse Row) has been specifically designed to efficiently perform such sparse matrix-matrix multiplications. This library is the basic building block f...

ASC 2 GPU Stream Compilation to Graphics Cards

Modern Graphics Processing Units (GPUs) offer vast acceleration opportunities for general computation as well as for graphics. As an additional acceleration medium the GPU can compete favourably with established media such as FPGAs. In this report we present a system for taking code written as a streaming abstraction and compiling it to run on a GPU. That same code can, with minor changes, be c...

Journal title:
  • CoRR

Volume abs/1709.03264

Publication date 2017